 knn model


Object Classification Utilizing Neuromorphic Proprioceptive Signals in Active Exploration: Validated on a Soft Anthropomorphic Hand

Wang, Fengyi, Fu, Xiangyu, Thakor, Nitish, Cheng, Gordon

arXiv.org Artificial Intelligence

Proprioception, a key sensory modality in haptic perception, plays a vital role in perceiving the 3D structure of objects by providing feedback on the position and movement of body parts. The restoration of proprioceptive sensation is crucial for enabling in-hand manipulation and natural control in the prosthetic hand. Despite its importance, proprioceptive sensation is relatively unexplored in an artificial system. In this work, we introduce a novel platform that integrates a soft anthropomorphic robot hand (QB SoftHand) with flexible proprioceptive sensors and a classifier that utilizes a hybrid spiking neural network with different types of spiking neurons to interpret neuromorphic proprioceptive signals encoded by a biological muscle spindle model. The encoding scheme and the classifier are implemented and tested on the datasets we collected in the active exploration of ten objects from the YCB benchmark. Our results indicate that the classifier achieves more accurate inferences than existing learning approaches, especially in the early stage of the exploration. This system holds the potential for development in the areas of haptic feedback and neural prosthetics.


Bags of Projected Nearest Neighbours: Competitors to Random Forests?

Hofmeyr, David P.

arXiv.org Machine Learning

In this paper we introduce a simple and intuitive adaptive k nearest neighbours classifier, and explore its utility within the context of bootstrap aggregating ("bagging"). The approach is based on finding discriminant subspaces which are computationally efficient to compute, and are motivated by enhancing the discrimination of classes through nearest neighbour classifiers. This adaptiveness promotes diversity of the individual classifiers fit across different bootstrap samples, and so further leverages the variance reducing effect of bagging. Extensive experimental results are presented documenting the strong performance of the proposed approach in comparison with Random Forest classifiers, as well as other nearest neighbours based ensembles from the literature, plus other relevant benchmarks. Code to implement the proposed approach is available in the form of an R package from https://github.com/DavidHofmeyr/BOPNN.
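To make the bagging idea concrete, here is a minimal sketch of bootstrap-aggregated nearest-neighbour classification in plain Python. It deliberately omits the paper's discriminant-subspace projection step, so it illustrates ordinary bagged kNN rather than the proposed method. The function names, toy data, and parameter defaults are illustrative assumptions and are not taken from the BOPNN package.

```python
import random
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training points closest to `query`."""
    # train is a list of (feature_tuple, label) pairs.
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def bagged_knn_predict(train, query, k=3, n_bags=25, seed=0):
    """Vote over kNN classifiers, each fit on a bootstrap resample."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_bags):
        # Bootstrap sample: draw len(train) points with replacement.
        sample = [rng.choice(train) for _ in train]
        votes[knn_predict(sample, query, k)] += 1
    return votes.most_common(1)[0][0]


# Hypothetical, well-separated toy data: two clusters in the plane.
train = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
         ((0.3, 0.3), "a"), ((5.0, 5.0), "b"), ((5.1, 4.9), "b"),
         ((4.9, 5.2), "b"), ((5.3, 5.1), "b")]

print(bagged_knn_predict(train, (0.2, 0.2)))
print(bagged_knn_predict(train, (5.0, 5.1)))
```

Each bag sees a slightly different training set, so the individual classifiers disagree on borderline points and the final vote averages out some of that variance.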


Reduced-order modeling and classification of hydrodynamic pattern formation in gravure printing

Rothmann-Brumm, Pauline, Brunton, Steven L., Scherl, Isabel

arXiv.org Artificial Intelligence

Hydrodynamic pattern formation phenomena in printing and coating processes are still not fully understood. However, fundamental understanding is essential to achieve high-quality printed products and to tune printed patterns according to the needs of a specific application like printed electronics, graphical printing, or biomedical printing. The aim of the paper is to develop an automated pattern classification algorithm based on methods from supervised machine learning and reduced-order modeling. We use the HYPA-p dataset, a large image dataset of gravure-printed images, which shows various types of hydrodynamic pattern formation phenomena. It enables the correlation of printing process parameters and resulting printed patterns for the first time. 26880 images of the HYPA-p dataset have been labeled by a human observer as dot patterns, mixed patterns, or finger patterns; 864000 images (97%) are unlabeled. A singular value decomposition (SVD) is used to find the modes of the labeled images and to reduce the dimensionality of the full dataset by truncation and projection. Selected machine learning classification techniques are trained on the reduced-order data. We investigate the effect of several factors, including classifier choice, whether or not fast Fourier transform (FFT) is used to preprocess the labeled images, data balancing, and data normalization. The best performing model is a k-nearest neighbor (kNN) classifier trained on unbalanced, FFT-transformed data with a test error of 3%, which outperforms a human observer by 7%. Data balancing slightly increases the test error of the kNN-model to 5%, but also increases the recall of the mixed class from 90% to 94%. Finally, we demonstrate how the trained models can be used to predict the pattern class of unlabeled images and how the predictions can be correlated to the printing process parameters, in the form of regime maps.


Simple Perturbations Subvert Ethereum Phishing Transactions Detection: An Empirical Analysis

Alghureid, Ahod, Mohaisen, David

arXiv.org Artificial Intelligence

This paper explores the vulnerability of machine learning models, specifically Random Forest, Decision Tree, and K-Nearest Neighbors, to very simple single-feature adversarial attacks in the context of Ethereum fraudulent transaction detection. Through comprehensive experimentation, we investigate the impact of various adversarial attack strategies on model performance metrics, such as accuracy, precision, recall, and F1-score. Our findings, highlighting how prone those techniques are to simple attacks, are alarming, and the inconsistency in the attacks' effect on different algorithms promises ways for attack mitigation. We examine the effectiveness of different mitigation strategies, including adversarial training and enhanced feature selection, in enhancing model robustness.


PEFA: Parameter-Free Adapters for Large-scale Embedding-based Retrieval Models

Chang, Wei-Cheng, Jiang, Jyun-Yu, Zhang, Jiong, Al-Darabsah, Mutasem, Teo, Choon Hui, Hsieh, Cho-Jui, Yu, Hsiang-Fu, Vishwanathan, S. V. N.

arXiv.org Artificial Intelligence

Embedding-based Retrieval Models (ERMs) have emerged as a promising framework for large-scale text retrieval problems due to powerful large language models. Nevertheless, fine-tuning ERMs to reach state-of-the-art results can be expensive due to the extreme scale of data as well as the complexity of multi-stage pipelines (e.g., pre-training, fine-tuning, distillation). In this work, we propose the PEFA framework, namely ParamEter-Free Adapters, for fast tuning of ERMs without any backward pass in the optimization. At the index-building stage, PEFA equips the ERM with a non-parametric k-nearest neighbor (kNN) component. At the inference stage, PEFA performs a convex combination of two scoring functions, one from the ERM and the other from the kNN. Based on the neighborhood definition, the PEFA framework induces two realizations, namely PEFA-XL (i.e., extra large) using double ANN indices and PEFA-XS (i.e., extra small) using a single ANN index. Empirically, PEFA achieves significant improvement on two retrieval applications. For document retrieval, regarding the Recall@100 metric, PEFA improves not only pre-trained ERMs on Trivia-QA by an average of 13.2%, but also fine-tuned ERMs on NQ-320K by an average of 5.5%. For product search, PEFA improves the Recall@100 of the fine-tuned ERMs by an average of 5.3% and 14.5% for PEFA-XS and PEFA-XL, respectively. Our code is available at https://github.com/amzn/pecos/tree/mainline/examples/pefa-wsdm24.
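The inference-time scoring described in this abstract can be sketched as a convex combination of two score sources. This is a minimal illustration, not the PECOS implementation: the weight name `lam`, the dictionary representation of scores, and the choice to treat a document missing from one source as scoring 0 there are all assumptions made for the example.

```python
def convex_combine(erm_scores, knn_scores, lam=0.5):
    """Blend two per-document score maps: lam * ERM + (1 - lam) * kNN."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1] for a convex combination")
    docs = set(erm_scores) | set(knn_scores)
    # Assumption: a document absent from one source contributes 0 from it.
    return {
        d: lam * erm_scores.get(d, 0.0) + (1.0 - lam) * knn_scores.get(d, 0.0)
        for d in docs
    }


def top_k(scores, k):
    """Return the k document ids with the highest blended score."""
    # Ties are broken by document id so the ordering is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))[:k]


# Hypothetical scores for one query over three documents.
erm_scores = {"doc1": 0.9, "doc2": 0.4, "doc3": 0.1}
knn_scores = {"doc2": 1.0, "doc3": 0.2}

blended = convex_combine(erm_scores, knn_scores, lam=0.5)
print(top_k(blended, 2))
```

Setting `lam=1.0` recovers the ERM ranking alone, and `lam=0.0` the kNN ranking alone, so the weight acts as a single tuning knob that needs no gradient computation.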


Current Topological and Machine Learning Applications for Bias Detection in Text

Farrelly, Colleen, Singh, Yashbir, Hathaway, Quincy A., Carlsson, Gunnar, Choudhary, Ashok, Paul, Rahul, Doretto, Gianfranco, Himeur, Yassine, Atalls, Shadi, Mansoor, Wathiq

arXiv.org Artificial Intelligence

Institutional bias can impact patient outcomes, educational attainment, and legal system navigation. Written records often reflect bias, and once bias is identified, it is possible to refer individuals for training to reduce bias. Many machine learning tools exist to explore text data and create predictive models that can search written records to identify real-time bias. However, few previous studies investigate large language model embeddings and geometric models of biased text data to understand geometry's impact on bias modeling accuracy. To overcome this issue, this study utilizes the RedditBias database to analyze textual biases. Four transformer models, including BERT and RoBERTa variants, were explored. Post-embedding, t-SNE allowed two-dimensional visualization of data. KNN classifiers differentiated bias types, with lower k-values proving more effective. Findings suggest BERT, particularly mini BERT, excels in bias classification, while multilingual models lag. The recommendation emphasizes refining monolingual models and exploring domain-specific biases.


Making informed decisions in cutting tool maintenance in milling: A KNN based model agnostic approach

Rahalkar, Aditya M., Khare, Om M., Patange, Abhishek D.

arXiv.org Artificial Intelligence

In machining processes, monitoring the condition of the tool is a crucial aspect to ensure high productivity and quality of the product. Using different machine learning techniques in Tool Condition Monitoring (TCM) enables a better analysis of the large amount of data of different signals acquired during the machining processes. The real-time force signals encountered during the process were acquired by performing numerous experiments. Different tool wear conditions were considered during the experimentation. A comprehensive statistical analysis of the data and feature selection using decision trees was conducted, and the KNN algorithm was used to perform classification. Hyperparameter tuning of the model was done to improve the model's performance. Much research has been done to employ machine learning approaches in tool condition monitoring systems; however, a model-agnostic approach to increase the interpretability of the process and get an in-depth understanding of how the decision-making is done is not implemented by many. This research paper presents a KNN-based white box model, which allows us to dive deep into how the model performs the classification and how it prioritizes the different features included. This approach helps in detecting why the tool is in a certain condition and allows the manufacturer to make an informed decision about the tool's maintenance.


Optimizing Data Shapley Interaction Calculation from O(2^n) to O(t n^2) for KNN models

Belaid, Mohamed Karim, Mekki, Dorra El, Rabus, Maximilian, Hüllermeier, Eyke

arXiv.org Artificial Intelligence

With the rapid growth of data availability and usage, quantifying the added value of each training data point has become a crucial process in the field of artificial intelligence. The Shapley values have been recognized as an effective method for data valuation, enabling efficient training set summarization, acquisition, and outlier removal. In this paper, we introduce "STI-KNN", an innovative algorithm that calculates the exact pair-interaction Shapley values for KNN models in O(t n^2) time, which is a significant improvement over the O(2^n) time complexity of baseline methods. By using STI-KNN, we can efficiently and accurately evaluate the value of individual data points, leading to improved training outcomes and ultimately enhancing the effectiveness of artificial intelligence applications.


ChMusic: A Traditional Chinese Music Dataset for Evaluation of Instrument Recognition

Gong, Xia, Zhu, Yuxiang, Zhu, Haidi, Wei, Haoran

arXiv.org Artificial Intelligence

Musical instrument recognition is a widely used application in music information retrieval. Because most previous musical instrument recognition datasets focus on Western instruments, it is difficult for researchers to study and evaluate traditional Chinese musical instrument recognition. This paper proposes ChMusic, a traditional Chinese music dataset for model training and performance evaluation. The dataset is free and publicly available, and it contains recordings of 11 traditional Chinese musical instruments and 55 traditional Chinese music excerpts. An evaluation standard based on the ChMusic dataset is then proposed. With this standard, researchers can compare their results under the same rules, making results from different researchers comparable.


Machine Learning -- K-Nearest Neighbors algorithm with Python

#artificialintelligence

'K-Nearest Neighbors (KNN) is a model that classifies data points based on the points that are most similar to it. It uses test data to make an "educated guess" on what an unclassified point should be classified as.' We will be building our KNN model using Python's most popular machine learning package, 'scikit-learn'. Scikit-learn provides data scientists with various tools for performing machine learning tasks. For our KNN model, we are going to use the 'KNeighborsClassifier' algorithm, which is readily available in the scikit-learn package. Finally, we will evaluate our KNN model predictions using the 'accuracy_score' function in scikit-learn.
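As a rough illustration of what such a classifier computes, the sketch below implements majority-vote k-nearest neighbours and an accuracy measure in plain Python. It is a from-scratch stand-in, not the scikit-learn code from the article: the helper names `knn_classify` and `accuracy` and the toy data are hypothetical. In scikit-learn the corresponding pieces are `KNeighborsClassifier` and `accuracy_score`.

```python
import math
from collections import Counter


def knn_classify(X_train, y_train, x, k=3):
    """Label `x` by majority vote among its k nearest training points."""
    neighbours = sorted(
        zip(X_train, y_train),
        key=lambda pair: math.dist(pair[0], x),  # Euclidean distance
    )[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical toy data: labelled training points and a held-out test set.
X_train = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (4.0, 4.0), (4.2, 3.9), (3.8, 4.1)]
y_train = ["low", "low", "low", "high", "high", "high"]
X_test = [(1.1, 1.0), (4.1, 4.0), (2.0, 2.0)]
y_test = ["low", "high", "high"]

y_pred = [knn_classify(X_train, y_train, x, k=3) for x in X_test]
print(y_pred, accuracy(y_test, y_pred))
```

The third test point sits closer to the "low" cluster, so it is misclassified against its deliberately contrary label, which shows how the accuracy figure reflects disagreements between predictions and held-out labels.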